Improvement in Connected Mandarin Digit Recognition by Explicitly Modeling Coarticulatory Information
نویسندگان
چکیده
The most successful training scheme for recognition of connected spoken digits is the segmental k-means algorithm, which implicitly captures the coarticulatory information of connected speech iteratively to establish reliable reference patterns. However, when this algorithm is applied to Mandarin digits, the obtained performance is inferior to that of English. Hence, a novel approach is proposed to build reliable reference patterns of connected Mandarin digits. Our method is to partition each training digit into three sections and they represent the coarticulation interacting with the preceding digit, the characteristic of the digit itself and the coarticulation interacting with the succeeding digit respectively. In this manner, the coarticulatory information is caught explicitly. Then we model these three sections separately using three Bayesian templates and a resultant multi-section Bayesian template is constructed for each reference Mandarin digit. The experimental result shows that the new method outperforms the segmental k-means by 3.2% when using a multi-speaker speech database.
منابع مشابه
Duration Modeling in Mandarin Connected Digit Recognition
Digit string recognition is required in many applications which need to recognize numbers such as telephone numbers, credit card numbers, date, etc. In order to design a high performance recognizer, duration information is explored in this study. In a Mandarin connected digit recognizer, insertion and deletion errors amount to more than two thirds of the total recognition errors because there e...
متن کاملPerformance of Mandarin Connected Digit Recognizer with Word Duration Modeling
Digit string recognition is required in many applications such as automatic banking system, database information retrieving system, etc. In order to design a high performance recognizer, duration information is explored in this study. In a Mandarin connected digit recognizer, insertion and deletion errors amount to more than two thirds of the total recognition errors because there exist two mon...
متن کاملNovel filler acoustic models for connected digit recognition
The context-dependent modeling technique is extended to include non-speech ller segments occurring between speech word units. In addition to the conventional context-dependent word or subword units, the proposed acoustic modeling provides an e cient way of accounting for the effects of the surrounding speech on the inter-word non-speech segments, especially for small vocabulary recognition task...
متن کاملAn Efficient Method for Removing Deletion Errors in Quickly-spoken Connected Mandarin Digit String Speech Recognition
Connected Mandarin digit string speech, especially at rapid spoken rate, is very difficult to recognize correctly. In this paper, a new training method named neighboring digits pattern is proposed in order to eliminate most of deletion errors which frequently occur in Mandarin digits speech recognition at high speaking rate when we have enough quickly-spoken speech data as the training set. The...
متن کاملRobust F0 modeling for Mandarin speech recognition in noise
The F0 contour plays an important role in recognizing spoken tonal languages like Mandarin Chinese. However, the discontinuity of F0 between voiced and unvoiced transition has traditionally been a bottleneck in creating a succinct statistical tone model for automatic speech recognition applications. By applying successfully the Multi-Space Distribution (MSD) to tone modeling, we recently report...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- J. Inf. Sci. Eng.
دوره 16 شماره
صفحات -
تاریخ انتشار 2000